Overview

Dataset statistics

Number of variables: 11
Number of observations: 19020
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 115
Duplicate rows (%): 0.6%
Total size in memory: 1.6 MiB
Average record size in memory: 88.0 B
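
The duplicate-row figures above can be reproduced in a few lines. The sketch below is illustrative only: it assumes every repeat of a row after its first occurrence counts as one duplicate, which may not be the exact rule the profiling tool applies, and `duplicate_summary` and the toy rows are hypothetical names and data, not part of the report.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (number of duplicate rows, percentage of all rows).

    Assumed rule: each repeat after the first occurrence of a row
    counts as one duplicate.
    """
    counts = Counter(tuple(r) for r in rows)
    n_dup = sum(c - 1 for c in counts.values())
    pct = 100.0 * n_dup / len(rows) if rows else 0.0
    return n_dup, pct


# Hypothetical toy rows, not taken from the dataset.
toy = [(1.0, "g"), (2.0, "h"), (1.0, "g"), (3.0, "g")]
print(duplicate_summary(toy))

# The percentage in the table follows from the two reported counts.
reported_pct = round(100.0 * 115 / 19020, 1)
print(reported_pct)
```

The last two lines only check that 115 duplicates out of 19020 observations rounds to the 0.6% shown in the table.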

Variable types

Numeric: 10
Categorical: 1

Alerts

Dataset has 115 (0.6%) duplicate rows [Duplicates]
fLength is highly correlated with fWidth and 7 other fields [High correlation]
fWidth is highly correlated with fLength and 6 other fields [High correlation]
fSize is highly correlated with fLength and 4 other fields [High correlation]
fConc is highly correlated with fLength and 4 other fields [High correlation]
fConc1 is highly correlated with fLength and 4 other fields [High correlation]
fAsym is highly correlated with fLength and 2 other fields [High correlation]
fM3Long is highly correlated with fLength and 5 other fields [High correlation]
fM3Trans is highly correlated with fLength and 1 other field [High correlation]
fAlpha is highly correlated with class [High correlation]
fDist is highly correlated with fLength [High correlation]
class is highly correlated with fAlpha [High correlation]

Reproduction

Analysis started: 2022-11-15 17:55:05.316638
Analysis finished: 2022-11-15 17:55:39.199396
Duration: 33.88 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json

Variables

fLength
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18643
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.25015393
Minimum: 4.2835
Maximum: 334.177
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 148.7 KiB

Quantile statistics

Minimum: 4.2835
5-th percentile: 16.433655
Q1: 24.336
median: 37.1477
Q3: 70.122175
95-th percentile: 139.72515
Maximum: 334.177
Range: 329.8935
Interquartile range (IQR): 45.786175

Descriptive statistics

Standard deviation: 42.36485494
Coefficient of variation (CV): 0.7955818306
Kurtosis: 4.970441241
Mean: 53.25015393
Median Absolute Deviation (MAD): 16.32565
Skewness: 2.013652324
Sum: 1012817.928
Variance: 1794.780934
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
20.7522               3      < 0.1%
24.8332               3      < 0.1%
26.9187               3      < 0.1%
19.1572               3      < 0.1%
12.9176               3      < 0.1%
98.2968               2      < 0.1%
32.2999               2      < 0.1%
84.5714               2      < 0.1%
24.8952               2      < 0.1%
12.4763               2      < 0.1%
Other values (18633)  18995  99.9%

Value                 Count  Frequency (%)
4.2835                1      < 0.1%
7.2079                1      < 0.1%
7.3606                1      < 0.1%
8.0518                1      < 0.1%
8.2304                1      < 0.1%
8.2311                1      < 0.1%
8.4802                1      < 0.1%
8.5738                1      < 0.1%
8.601                 1      < 0.1%
8.6998                1      < 0.1%

Value                 Count  Frequency (%)
334.177               1      < 0.1%
310.61                1      < 0.1%
305.422               1      < 0.1%
305.324               1      < 0.1%
305.0961              1      < 0.1%
303.5676              1      < 0.1%
303.2787              1      < 0.1%
299.9304              1      < 0.1%
297.1239              1      < 0.1%
295.672               1      < 0.1%
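
Several of the fLength statistics reported above are derived from others, so they can be cross-checked with simple arithmetic. The sketch below copies the reported values and recomputes the range, IQR, coefficient of variation, variance, and mean; the variable names are illustrative, and the relations used (CV = standard deviation / mean, variance = standard deviation squared) are the standard definitions, assumed to match what the tool computes.

```python
import math

# Values copied from the fLength tables in this report.
minimum, maximum = 4.2835, 334.177
q1, q3 = 24.336, 70.122175
mean, std = 53.25015393, 42.36485494
total, n = 1012817.928, 19020

value_range = maximum - minimum  # compare with reported Range
iqr = q3 - q1                    # compare with reported IQR
cv = std / mean                  # compare with reported CV
variance = std ** 2              # compare with reported Variance
mean_from_sum = total / n        # compare with reported Mean

for name, value in [
    ("range", value_range),
    ("IQR", iqr),
    ("CV", cv),
    ("variance", variance),
    ("mean", mean_from_sum),
]:
    print(f"{name}: {value:.6f}")
```

Each recomputed value agrees with the corresponding table entry to within rounding of the printed digits.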

fWidth
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18200
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.18096622
Minimum: 0
Maximum: 256.382
Zeros: 98
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 148.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.4005
Q1: 11.8638
median: 17.1399
Q3: 24.739475
95-th percentile: 58.479245
Maximum: 256.382
Range: 256.382
Interquartile range (IQR): 12.875675

Descriptive statistics

Standard deviation: 18.3460563
Coefficient of variation (CV): 0.8271080761
Kurtosis: 16.76540668
Mean: 22.18096622
Median Absolute Deviation (MAD): 5.87145
Skewness: 3.371627981
Sum: 421881.9775
Variance: 336.5777816
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0                     98     0.5%
10.7539               4      < 0.1%
0.0001                3      < 0.1%
10.0342               3      < 0.1%
15.8644               3      < 0.1%
0.0029                3      < 0.1%
9.5513                3      < 0.1%
0.0033                3      < 0.1%
20.2021               3      < 0.1%
12.8155               3      < 0.1%
Other values (18190)  18894  99.3%

Value                 Count  Frequency (%)
0                     98     0.5%
0.0001                3      < 0.1%
0.0002                1      < 0.1%
0.0006                1      < 0.1%
0.0019                1      < 0.1%
0.0025                2      < 0.1%
0.0026                2      < 0.1%
0.0027                1      < 0.1%
0.0028                3      < 0.1%
0.0029                3      < 0.1%

Value                 Count  Frequency (%)
256.382               1      < 0.1%
228.0385              1      < 0.1%
220.5144              1      < 0.1%
201.364               1      < 0.1%
190.5432              1      < 0.1%
190.139               1      < 0.1%
188.8866              1      < 0.1%
186.928               1      < 0.1%
179.2924              1      < 0.1%
177.782               1      < 0.1%

fSize
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7228
Distinct (%): 38.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.825016961
Minimum: 1.9413
Maximum: 5.3233
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 148.7 KiB

Quantile statistics

Minimum: 1.9413
5-th percentile: 2.1945
Q1: 2.4771
median: 2.7396
Q3: 3.1016
95-th percentile: 3.71575
Maximum: 5.3233
Range: 3.382
Interquartile range (IQR): 0.6245

Descriptive statistics

Standard deviation: 0.4725986487
Coefficient of variation (CV): 0.1672905527
Kurtosis: 0.7272784359
Mean: 2.825016961
Median Absolute Deviation (MAD): 0.29895
Skewness: 0.8755071709
Sum: 53731.8226
Variance: 0.2233494827
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
2.1508                27     0.1%
2.1287                24     0.1%
2.0774                24     0.1%
2.1319                23     0.1%
2.1414                22     0.1%
2.3139                22     0.1%
2.1351                22     0.1%
2.3936                21     0.1%
2.29                  21     0.1%
2.3589                20     0.1%
Other values (7218)   18794  98.8%

Value                 Count  Frequency (%)
1.9413                1      < 0.1%
1.9468                1      < 0.1%
1.9916                1      < 0.1%
1.9978                1      < 0.1%
2.0022                1      < 0.1%
2.0065                2      < 0.1%
2.0107                3      < 0.1%
2.0149                4      < 0.1%
2.0191                1      < 0.1%
2.0233                8      < 0.1%

Value                 Count  Frequency (%)
5.3233                1      < 0.1%
5.1795                1      < 0.1%
5.1467                1      < 0.1%
5.0118                1      < 0.1%
5.01                  1      < 0.1%
4.9946                1      < 0.1%
4.9518                1      < 0.1%
4.9369                1      < 0.1%
4.905                 1      < 0.1%
4.8501                1      < 0.1%

fConc
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6410
Distinct (%): 33.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3803270715
Minimum: 0.0131
Maximum: 0.893
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 148.7 KiB

Quantile statistics

Minimum: 0.0131
5-th percentile: 0.1263
Q1: 0.2358
median: 0.35415
Q3: 0.5037
95-th percentile: 0.734205
Maximum: 0.893
Range: 0.8799
Interquartile range (IQR): 0.2679

Descriptive statistics

Standard deviation: 0.1828131472
Coefficient of variation (CV): 0.4806735069
Kurtosis: -0.5212970988
Mean: 0.3803270715
Median Absolute Deviation (MAD): 0.13025
Skewness: 0.4858884539
Sum: 7233.8209
Variance: 0.0334206468
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0.6                   16     0.1%
0.4                   12     0.1%
0.4116                12     0.1%
0.2979                12     0.1%
0.2175                11     0.1%
0.2214                11     0.1%
0.5                   11     0.1%
0.6154                11     0.1%
0.193                 11     0.1%
0.2408                11     0.1%
Other values (6400)   18902  99.4%

Value                 Count  Frequency (%)
0.0131                1      < 0.1%
0.0133                1      < 0.1%
0.0137                1      < 0.1%
0.0139                2      < 0.1%
0.0158                1      < 0.1%
0.0162                1      < 0.1%
0.0171                1      < 0.1%
0.0188                1      < 0.1%
0.0196                1      < 0.1%
0.0206                1      < 0.1%

Value                 Count  Frequency (%)
0.893                 1      < 0.1%
0.8912                1      < 0.1%
0.8889                1      < 0.1%
0.8846                1      < 0.1%
0.8786                1      < 0.1%
0.8778                1      < 0.1%
0.8772                1      < 0.1%
0.8757                1      < 0.1%
0.8745                1      < 0.1%
0.8743                1      < 0.1%

fConc1
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 4421
Distinct (%): 23.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2146571346
Minimum: 0.0003
Maximum: 0.6752
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 148.7 KiB

Quantile statistics

Minimum: 0.0003
5-th percentile: 0.066995
Q1: 0.128475
median: 0.1965
Q3: 0.285225
95-th percentile: 0.42241
Maximum: 0.6752
Range: 0.6749
Interquartile range (IQR): 0.15675

Descriptive statistics

Standard deviation: 0.1105107989
Coefficient of variation (CV): 0.5148247185
Kurtosis: 0.0293910244
Mean: 0.2146571346
Median Absolute Deviation (MAD): 0.0754
Skewness: 0.6856946259
Sum: 4082.7787
Variance: 0.01221263667
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0.194                 18     0.1%
0.2126                16     0.1%
0.1939                16     0.1%
0.2                   16     0.1%
0.217                 15     0.1%
0.2251                15     0.1%
0.1515                14     0.1%
0.1504                14     0.1%
0.1279                14     0.1%
0.1568                14     0.1%
Other values (4411)   18868  99.2%

Value                 Count  Frequency (%)
0.0003                1      < 0.1%
0.0008                1      < 0.1%
0.0011                1      < 0.1%
0.0015                1      < 0.1%
0.002                 1      < 0.1%
0.0047                1      < 0.1%
0.005                 1      < 0.1%
0.0072                1      < 0.1%
0.0073                1      < 0.1%
0.0076                1      < 0.1%

Value                 Count  Frequency (%)
0.6752                1      < 0.1%
0.674                 1      < 0.1%
0.643                 1      < 0.1%
0.637                 1      < 0.1%
0.6296                1      < 0.1%
0.6283                1      < 0.1%
0.6264                1      < 0.1%
0.6242                1      < 0.1%
0.6224                1      < 0.1%
0.6204                1      < 0.1%

fAsym
Real number (ℝ)

HIGH CORRELATION

Distinct: 18704
Distinct (%): 98.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -4.331745158
Minimum: -457.9161
Maximum: 575.2407
Zeros: 41
Zeros (%): 0.2%
Negative: 8448
Negative (%): 44.4%
Memory size: 148.7 KiB

Quantile statistics

Minimum: -457.9161
5-th percentile: -111.1947
Q1: -20.58655
median: 4.01305
Q3: 24.0637
95-th percentile: 65.544125
Maximum: 575.2407
Range: 1033.1568
Interquartile range (IQR): 44.65025

Descriptive statistics

Standard deviation: 59.20606198
Coefficient of variation (CV): -13.66794671
Kurtosis: 8.155329763
Mean: -4.331745158
Median Absolute Deviation (MAD): 21.68065
Skewness: -1.046441472
Sum: -82389.7929
Variance: 3505.357776
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0                     41     0.2%
-0.0001               7      < 0.1%
8.8077                3      < 0.1%
7.1088                3      < 0.1%
-1.4761               3      < 0.1%
-0.5062               3      < 0.1%
15                    2      < 0.1%
36.6631               2      < 0.1%
-2.0651               2      < 0.1%
58.6184               2      < 0.1%
Other values (18694)  18952  99.6%

Value                 Count  Frequency (%)
-457.9161             1      < 0.1%
-449.9526             1      < 0.1%
-382.594              1      < 0.1%
-381.734              1      < 0.1%
-378.9457             1      < 0.1%
-368.633              1      < 0.1%
-363.3382             1      < 0.1%
-353.934              1      < 0.1%
-353.26               1      < 0.1%
-349.757              1      < 0.1%

Value                 Count  Frequency (%)
575.2407              1      < 0.1%
473.0654              1      < 0.1%
464.631               1      < 0.1%
444.401               1      < 0.1%
433.0957              1      < 0.1%
402.925               1      < 0.1%
402.1863              1      < 0.1%
400.284               1      < 0.1%
396.3379              1      < 0.1%
384.3477              1      < 0.1%

fM3Long
Real number (ℝ)

HIGH CORRELATION

Distinct: 18693
Distinct (%): 98.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.54554482
Minimum: -331.78
Maximum: 238.321
Zeros: 39
Zeros (%): 0.2%
Negative: 6604
Negative (%): 34.7%
Memory size: 148.7 KiB

Quantile statistics

Minimum: -331.78
5-th percentile: -80.28369
Q1: -12.842775
median: 15.3141
Q3: 35.8378
95-th percentile: 83.07177
Maximum: 238.321
Range: 570.101
Interquartile range (IQR): 48.680575

Descriptive statistics

Standard deviation: 51.00011801
Coefficient of variation (CV): 4.836176689
Kurtosis: 4.670973798
Mean: 10.54554482
Median Absolute Deviation (MAD): 25.33365
Skewness: -1.123078055
Sum: 200576.2624
Variance: 2601.012037
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0                     39     0.2%
-0.0001               4      < 0.1%
16.0747               3      < 0.1%
-10.7301              2      < 0.1%
20.1723               2      < 0.1%
-18.5409              2      < 0.1%
54.47                 2      < 0.1%
22.649                2      < 0.1%
-18.2535              2      < 0.1%
14.9656               2      < 0.1%
Other values (18683)  18960  99.7%

Value                 Count  Frequency (%)
-331.78               1      < 0.1%
-318.3002             1      < 0.1%
-297.1717             1      < 0.1%
-293.1762             1      < 0.1%
-287.5067             1      < 0.1%
-287.3636             1      < 0.1%
-284.7038             1      < 0.1%
-281.9541             1      < 0.1%
-281.844              1      < 0.1%
-281.435              1      < 0.1%

Value                 Count  Frequency (%)
238.321               1      < 0.1%
231.446               1      < 0.1%
227.8174              1      < 0.1%
226.3506              1      < 0.1%
222.417               1      < 0.1%
217.934               1      < 0.1%
217.624               1      < 0.1%
216.985               1      < 0.1%
215.894               1      < 0.1%
203.863               1      < 0.1%

fM3Trans
Real number (ℝ)

HIGH CORRELATION

Distinct: 18390
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2497259569
Minimum: -205.8947
Maximum: 179.851
Zeros: 59
Zeros (%): 0.3%
Negative: 9404
Negative (%): 49.4%
Memory size: 148.7 KiB

Quantile statistics

Minimum: -205.8947
5-th percentile: -25.76384
Q1: -10.849375
median: 0.6662
Q3: 10.946425
95-th percentile: 26.99851
Maximum: 179.851
Range: 385.7457
Interquartile range (IQR): 21.7958

Descriptive statistics

Standard deviation: 20.82743895
Coefficient of variation (CV): 83.40117786
Kurtosis: 8.580352473
Mean: 0.2497259569
Median Absolute Deviation (MAD): 10.888
Skewness: 0.1201212735
Sum: 4749.7877
Variance: 433.7822131
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0                     59     0.3%
-0.0001               24     0.1%
0.0001                18     0.1%
-5.4454               3      < 0.1%
-7.6601               3      < 0.1%
11.1602               3      < 0.1%
6.1829                3      < 0.1%
-8.975                3      < 0.1%
10.9015               3      < 0.1%
9.5231                3      < 0.1%
Other values (18380)  18898  99.4%

Value                 Count  Frequency (%)
-205.8947             1      < 0.1%
-164.14               1      < 0.1%
-149.5513             1      < 0.1%
-142.5894             1      < 0.1%
-142.119              1      < 0.1%
-135.5051             1      < 0.1%
-134.75               1      < 0.1%
-134.395              1      < 0.1%
-133.1359             1      < 0.1%
-132.416              1      < 0.1%

Value                 Count  Frequency (%)
179.851               1      < 0.1%
170.692               1      < 0.1%
163.2697              1      < 0.1%
154.865               1      < 0.1%
143.8753              1      < 0.1%
139.2361              1      < 0.1%
132.589               1      < 0.1%
132.388               1      < 0.1%
131.5547              1      < 0.1%
130.8545              1      < 0.1%
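
The coefficient of variation reported for fM3Trans (83.4) is far larger than for most other variables. Assuming the usual definition CV = standard deviation / mean, this comes from the mean being close to zero, not from an unusually wide spread. The sketch below recomputes the CV from the reported values for three variables; `coefficient_of_variation` is an illustrative helper, not a function of the profiling tool.

```python
def coefficient_of_variation(std, mean):
    """Standard deviation divided by the mean (assumed definition of CV)."""
    return std / mean


# (standard deviation, mean) pairs copied from this report.
reported = {
    "fM3Trans": (20.82743895, 0.2497259569),
    "fAsym": (59.20606198, -4.331745158),
    "fDist": (74.73178696, 193.8180265),
}

cv = {name: coefficient_of_variation(s, m) for name, (s, m) in reported.items()}
for name, value in cv.items():
    print(f"{name}: {value:.4f}")
```

Because the mean sits in the denominator, a negative mean (fAsym) yields a negative CV and a near-zero mean (fM3Trans) yields a very large one, so the CV is hard to interpret for variables centred around zero.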

fAlpha
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 17981
Distinct (%): 94.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.64570668
Minimum: 0
Maximum: 90
Zeros: 5
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 148.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.933285
Q1: 5.547925
median: 17.6795
Q3: 45.88355
95-th percentile: 80.72654
Maximum: 90
Range: 90
Interquartile range (IQR): 40.335625

Descriptive statistics

Standard deviation: 26.10362051
Coefficient of variation (CV): 0.9442196872
Kurtosis: -0.5337036036
Mean: 27.64570668
Median Absolute Deviation (MAD): 14.6924
Skewness: 0.8508898774
Sum: 525821.341
Variance: 681.3990037
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0.0002                7      < 0.1%
0                     5      < 0.1%
0.386                 4      < 0.1%
1.29                  4      < 0.1%
90                    4      < 0.1%
0.804                 4      < 0.1%
0.256                 4      < 0.1%
3.4161                4      < 0.1%
2.76                  4      < 0.1%
2.701                 4      < 0.1%
Other values (17971)  18976  99.8%

Value                 Count  Frequency (%)
0                     5      < 0.1%
0.0002                7      < 0.1%
0.0003                2      < 0.1%
0.001                 1      < 0.1%
0.0031                1      < 0.1%
0.0056                1      < 0.1%
0.0086                1      < 0.1%
0.009                 1      < 0.1%
0.0097                1      < 0.1%
0.0103                1      < 0.1%

Value                 Count  Frequency (%)
90                    4      < 0.1%
89.9798               1      < 0.1%
89.9579               1      < 0.1%
89.9535               1      < 0.1%
89.9528               1      < 0.1%
89.9229               1      < 0.1%
89.9155               1      < 0.1%
89.9087               1      < 0.1%
89.9076               1      < 0.1%
89.9042               1      < 0.1%

fDist
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18437
Distinct (%): 96.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 193.8180265
Minimum: 1.2826
Maximum: 495.561
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 148.7 KiB

Quantile statistics

Minimum: 1.2826
5-th percentile: 71.41369
Q1: 142.49225
median: 191.85145
Q3: 240.563825
95-th percentile: 326.659975
Maximum: 495.561
Range: 494.2784
Interquartile range (IQR): 98.071575

Descriptive statistics

Standard deviation: 74.73178696
Coefficient of variation (CV): 0.3855770711
Kurtosis: -0.112576594
Mean: 193.8180265
Median Absolute Deviation (MAD): 49.0165
Skewness: 0.2295873764
Sum: 3686418.863
Variance: 5584.839983
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
182.013               3      < 0.1%
227.107               3      < 0.1%
265.238               3      < 0.1%
146.354               3      < 0.1%
186.828               3      < 0.1%
168.774               3      < 0.1%
216.032               3      < 0.1%
187.651               3      < 0.1%
148.372               3      < 0.1%
100.395               3      < 0.1%
Other values (18427)  18990  99.8%

Value                 Count  Frequency (%)
1.2826                1      < 0.1%
5.5449                1      < 0.1%
5.5922                1      < 0.1%
5.6998                1      < 0.1%
5.7456                1      < 0.1%
6.564                 1      < 0.1%
6.6852                1      < 0.1%
9.1574                1      < 0.1%
13.1108               1      < 0.1%
14.0229               1      < 0.1%

Value                 Count  Frequency (%)
495.561               1      < 0.1%
466.4078              1      < 0.1%
450.953               1      < 0.1%
450.402               1      < 0.1%
450.349               1      < 0.1%
448.0295              1      < 0.1%
446.488               1      < 0.1%
438.901               1      < 0.1%
438.8574              1      < 0.1%
437.477               1      < 0.1%

class
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 148.7 KiB

g: 12332
h: 6688

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 19020
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: g
2nd row: g
3rd row: g
4th row: g
5th row: g

Common Values

Value                 Count  Frequency (%)
g                     12332  64.8%
h                     6688   35.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value                 Count  Frequency (%)
g                     12332  64.8%
h                     6688   35.2%

Most occurring characters

Value                 Count  Frequency (%)
g                     12332  64.8%
h                     6688   35.2%

Most occurring categories

Value                 Count  Frequency (%)
Lowercase Letter      19020  100.0%

Most frequent character per category

Lowercase Letter
Value                 Count  Frequency (%)
g                     12332  64.8%
h                     6688   35.2%

Most occurring scripts

Value                 Count  Frequency (%)
Latin                 19020  100.0%

Most frequent character per script

Latin
Value                 Count  Frequency (%)
g                     12332  64.8%
h                     6688   35.2%

Most occurring blocks

Value                 Count  Frequency (%)
ASCII                 19020  100.0%

Most frequent character per block

ASCII
Value                 Count  Frequency (%)
g                     12332  64.8%
h                     6688   35.2%

Interactions

[Figures: pairwise interaction plots for the 10 numeric variables (10 × 10 grid)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
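
The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. This is a minimal illustration, not the code used to build this report; the function names and the toy data are hypothetical, and ties are assumed to receive the mean of their ranks.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / (sx * sy)


# Toy data: y is a nonlinear but strictly increasing function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]
print(spearman_rho(x, y))
```

Because y increases whenever x does, the ranks of the two lists coincide and ρ comes out at 1 up to floating-point rounding, even though the relationship is not linear.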

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
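
As a minimal sketch of this formula, the function below divides the covariance of X and Y by the product of their standard deviations. The name `pearson_r` and the toy data are illustrative assumptions; the population (1/n) form is used, which gives the same r as the sample form because the normalising factors cancel.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std devs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


# Toy data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_linear = [2.0 * v for v in x]
y_cubic = [v ** 3 for v in x]

r_linear = pearson_r(x, y_linear)
# Shift and rescale each variable separately with positive scale factors.
r_shifted = pearson_r([10.0 * a + 3.0 for a in x], [0.5 * b - 7.0 for b in y_linear])
r_cubic = pearson_r(x, y_cubic)

print(r_linear, r_shifted, r_cubic)
```

The first two results agree, illustrating the location and scale invariance mentioned above, while the cubic relationship gives an r below 1 because it is monotonic but not linear.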

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
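
The pair-counting rule above can be sketched directly. The function below implements the simple form given in the text (often called tau-a); library implementations may apply a tie correction (tau-b), so results can differ when the data contain ties. `kendall_tau_a` and the toy data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Toy data, not taken from the dataset: 7 concordant and 3 discordant pairs.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
print(kendall_tau_a(x, y))
```

With 5 observations there are 10 pairs, so the toy example gives τ = (7 - 3) / 10.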

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.

Sample

First rows

        fLength    fWidth     fSize     fConc    fConc1     fAsym   fM3Long  fM3Trans    fAlpha     fDist class
    0   28.7967   16.0021    2.6449    0.3918    0.1982   27.7004   22.0110   -8.2027   40.0920   81.8828     g
    1   31.6036   11.7235    2.5185    0.5303    0.3773   26.2722   23.8238   -9.9574    6.3609  205.2610     g
    2  162.0520  136.0310    4.0612    0.0374    0.0187  116.7410  -64.8580  -45.2160   76.9600  256.7880     g
    3   23.8172    9.5728    2.3385    0.6147    0.3922   27.2107   -6.4633   -7.1513   10.4490  116.7370     g
    4   75.1362   30.9205    3.1611    0.3168    0.1832   -5.5277   28.5525   21.8393    4.6480  356.4620     g
    5   51.6240   21.1502    2.9085    0.2420    0.1340   50.8761   43.1887    9.8145    3.6130  238.0980     g
    6   48.2468   17.3565    3.0332    0.2529    0.1515    8.5730   38.0957   10.5868    4.7920  219.0870     g
    7   26.7897   13.7595    2.5521    0.4236    0.2174   29.6339   20.4560   -2.9292    0.8120  237.1340     g
    8   96.2327   46.5165    4.1540    0.0779    0.0390  110.3550   85.0486   43.1844    4.8540  248.2260     g
    9   46.7619   15.1993    2.5786    0.3377    0.1913   24.7548   43.8771   -6.6812    7.8750  102.2510     g

Last rows

        fLength    fWidth     fSize     fConc    fConc1     fAsym   fM3Long  fM3Trans    fAlpha     fDist class
19010   32.4902   10.6723    2.4742    0.4664    0.2735  -27.0097  -21.1687    8.4813   69.1730  120.6680     h
19011   79.5528   44.9929    3.5488    0.1656    0.0900  -39.6213   53.7866  -30.0054   15.8075  311.5680     h
19012   31.8373   13.8734    2.8251    0.4169    0.1988  -16.4919  -27.1448   11.1098   11.3663  100.0566     h
19013  182.5003   76.5568    3.6872    0.1123    0.0666  192.2675   93.0302  -62.6192   82.1691  283.4731     h
19014   43.2980   17.3545    2.8307    0.2877    0.1646  -60.1842  -33.8513   -3.6545   78.4099  224.8299     h
19015   21.3846   10.9170    2.6161    0.5857    0.3934   15.2618   11.5245    2.8766    2.4229  106.8258     h
19016   28.9452    6.7020    2.2672    0.5351    0.2784   37.0816   13.1853   -2.9632   86.7975  247.4560     h
19017   75.4455   47.5305    3.4483    0.1417    0.0549   -9.3561   41.0562   -9.4662   30.2987  256.5166     h
19018  120.5135   76.9018    3.9939    0.0944    0.0683    5.8043  -93.5224  -63.8389   84.6874  408.3166     h
19019  187.1814   53.0014    3.2093    0.2876    0.1539 -167.3125 -168.4558   31.4755   52.7310  272.3174     h

Duplicate rows

Most frequently occurring

        fLength    fWidth     fSize     fConc    fConc1     fAsym   fM3Long  fM3Trans    fAlpha     fDist class  # duplicates
    0   12.9176   11.3596    2.1123    0.7413    0.3900   15.0388   -5.6768  -11.5638   64.9330  227.1070     h             2
    1   12.9801   10.8815    2.4175    0.7457    0.4723  -13.6970    6.0371   -7.0019   30.8030   78.2618     h             2
    2   13.0287   10.9544    2.2000    0.7571    0.4511  -14.0985    5.7807  -10.1748   64.8700  182.9800     h             2
    3   14.7912   11.7955    2.3075    0.6749    0.4557    1.3533    4.7675   -9.0611   62.2500   62.5245     h             2
    4   16.7566   11.3063    2.3766    0.5840    0.3550    0.0000    0.1543    6.7419   48.5040  117.6360     h             2
    5   16.9894   11.0002    2.4564    0.6294    0.3514   -3.4902    8.0823   -7.0516   55.3930   91.3761     h             2
    6   18.4343   17.8717    2.3847    0.4866    0.2701  -15.7044  -16.5170  -12.2311   71.0730  158.7030     h             2
    7   18.4914    9.7635    2.4829    0.6579    0.3734   -1.8060    7.6520    6.7260   33.8161  188.8670     h             2
    8   18.8090   11.1305    2.5496    0.6037    0.4245    0.5645   -3.0608   11.8176   75.5740  222.5910     h             2
    9   19.0848   13.7346    2.5849    0.6008    0.3966   17.2824   19.4696   -5.2390    5.3161  213.7140     h             2